APPENDIX 



Copyright © 1996, Paul Nelson




Time-Dimensional Tables 

One of the biggest problems facing database developers is the representation of historic
and future-dated information in the tables of a relational database. Typically, a type of table
referred to as a "versioned table", which has an effective date as the last column of its
primary key, is created to capture date-sensitive information. Although relational database
management systems are capable of capturing date-sensitive information in its most basic
form through the use of versioned tables, they are unable to adequately process such
information. Retrieval of rows from a versioned table is a complex matter usually involving
the use of subquery logic. Inexperienced requesters are unable to create such statements
and a significant amount of development time is required of those requesters who are
experienced. In addition, referential integrity cannot be appropriately enforced for versioned
tables. In order to validate referential constraints, the DBMS currently matches the values
of dependent table columns with the value of the primary key of the parent table. The value
of the effective date column in the primary key of a versioned table represents only the first
in a range of dates for which the information is effective. This range of dates is referred to
as an "effective window". Referential constraints can be enforced for the value of the
effective date, but not for the entire effective window. Because of these problems,
database developers often avoid creating versioned tables to hold date-sensitive information.
In many cases, business requirements which dictate the use of versioned tables are ignored.
There have been many methods devised which avoid some of the problems inherent with
versioned tables, but none of these methods fully address the problems. Some of the
methods actually introduce additional problems. The DBMS could be modified to
accommodate date-sensitive information in a way that would allow for simple retrieval
against versioned tables and which would also enforce referential constraints on an effective
window basis. This method, termed "time-dimensional tables", is described in the remainder
of this article.

Time-dimensional tables are defined to the database management system in much the
same way that versioned tables are currently defined. The phrase "TABLE IS
TIME-DIMENSIONAL" is included in the definition of a time-dimensional table. When the DBMS
encounters this phrase, it marks the table as time-dimensional. The last column in the
primary key is recognized by the DBMS as the effective start date column for the table. All
other columns in the physical primary key form what will be referred to as the "logical
primary key" of the table. The logical primary key should be recognized as the identifier for
the information contained in a time-dimensional table. It is possible that several rows exist
in a table with the same logical primary key. Each of these rows represents the same
information at various points in time. The effective start date column allows these related
rows to be differentiated. Thus, an additional dimension, the "time dimension", is recognized
by the DBMS.

In many cases, a column is included in a versioned table to indicate the date on which a
row is no longer effective. This would be specified during the definition of a
time-dimensional table by including the phrase "EFFECTIVE END COLUMN IS column-name". The
column named in the phrase cannot participate in the primary key of the table since it is not
required to uniquely identify versions. The values of the effective start date column and the
effective end date column are used by the DBMS to determine the effective window for
each row of a time-dimensional table. The range of dates in the effective window of a row
will be determined as all dates from (and including) the effective start date until (but not
including) the effective end date. If a table has not been defined with an effective end date
or a row contains a null entry in its effective end date column, the effective end date will be
defaulted. First, the DBMS will attempt to retrieve the next chronological row of the table with
the same logical primary key. If a row is found, the effective start date of that row will be
used as the default of the effective end date of the original row. By defaulting in this
manner, gaps between the effective dates are prevented. If there are no rows with the
same logical primary key and a later effective start date, the value of the effective end date
is defaulted to the highest possible date value. A row is considered to be "active" on all
dates that fall within its effective window. A row is considered to be "pending activation"
on all dates prior to its effective window. A row is considered to be "inactive" on all dates
following its effective window.

A predicate is added to the WHERE clause of the SQL to simplify requests for information
from time-dimensional tables. The format of the new time-dimensional predicate is shown in
table 1. The (table-name) clause is optional and identifies the time-dimensional table for
which the predicate applies. If the (table-name) clause is omitted, the predicate will apply
for all time-dimensional tables referenced in the request. The (IS/WAS/WILL BE) clause is
optional and is added merely for the sake of clarity. The ACTIVE/INACTIVE/PENDING clause
is the only clause that is required and will be used by the DBMS to determine which rows of
a time-dimensional table are to be included in the result set based upon their effective
status. If ACTIVE is specified, all rows which represent current information on the target
date will be included. If INACTIVE is specified, all rows which represent information that is
no longer current on the target date will be included. If PENDING is specified, all rows
which represent information that is not yet current on the target date will be included. The
(NOT) clause is optional and indicates that the effective status specified in the
ACTIVE/INACTIVE/PENDING clause is to be used to exclude rather than include rows. The
(ON date) clause is optional and is used to specify a target date for time-dimensional
processing. The (BETWEEN start-date AND end-date) clause is also optional and is used
to specify a range of target dates for time-dimensional processing. When this clause is
included in the predicate, the range of target dates will be determined as all dates from (and
including) the date specified in the start-date parameter until (and excluding) the date
specified in the end-date parameter. The (ON date) clause and the (BETWEEN start-date
AND end-date) clause may not be used in the same time-dimensional predicate. If neither
target date specification clause is included, the current system date will be defaulted as the
target date for the predicate.



TABLE 1

WHERE
   (table-name)
   (IS/WAS/WILL BE) (NOT) ACTIVE/INACTIVE/PENDING
   (ON date) (BETWEEN start-date AND end-date)
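The effective-window and status rules described above can be illustrated with a small sketch. It is only an approximation under stated assumptions, not part of the proposed DBMS: the function names `effective_windows` and `status_on` are hypothetical, `date.max` stands in for "the highest possible date value", and all rows passed in are assumed to share one logical primary key.

```python
from datetime import date

# Assumption: date.max plays the role of "the highest possible date value".
MAX_DATE = date.max


def effective_windows(versions):
    """Return half-open windows [start, end) for rows of ONE logical primary key.

    `versions` is a list of (effective_start, effective_end_or_None) pairs.
    A missing end date defaults to the start date of the next chronological
    row, or to MAX_DATE when no later row exists.
    """
    ordered = sorted(versions, key=lambda v: v[0])
    windows = []
    for i, (start, end) in enumerate(ordered):
        if end is None:
            end = ordered[i + 1][0] if i + 1 < len(ordered) else MAX_DATE
        windows.append((start, end))
    return windows


def status_on(window, target):
    """Classify one row's window against a single target date."""
    start, end = window
    if target < start:
        return "PENDING"    # target date is prior to the effective window
    if target >= end:
        return "INACTIVE"   # target date follows the effective window
    return "ACTIVE"         # target date falls within the effective window
```

For two rows starting on 1993-01-01 and 1995-01-01 with no end dates, the first window closes on 1995-01-01 (exclusive), so a target date of 1994-06-01 makes the first row active and the second row pending.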



Since the complex logic required to retrieve rows from time-dimensional tables is static
and predictable, the function of creating this logic is internalized into the DBMS. By
internalizing this logic, an opportunity for efficiency maximization is created in addition to
relieving the requester of the complicated task. Only the most basic time-dimensional
retrieval criteria must be included in a request. This basic information is passed to the
DBMS from the requester through the use of the time-dimensional predicate. The DBMS
expands the predicate using information defined for time-dimensional tables and completes
the request.



TIME-DIMENSIONAL EXAMPLE 1 (retrieving current information)

Figure 1 shows two tables of an example database. Assume that the EMPL_PAY_RATE
table has been defined as a time-dimensional table. The DBMS would recognize the
PAY_EFF_DATE as the effective start date of the table since it is the last column in the
physical primary key. The EMPL_ID column would be recognized as the table's logical
primary key. The EMPL_PAY_RATE table can be used to maintain several different versions
of an individual employee's pay rate. The different versions would be identified by the
values of the PAY_EFF_DATE column. Historical, current and future pay rates may all be
tracked in the EMPL_PAY_RATE table.

FIGURE 1

Table 2 shows two requests which retrieve employee names and the current rate of pay
for each employee from the example tables. The request titled "CURRENT METHOD"
includes the complex subquery logic required to process the versioned tables. The request
titled "TIME-DIMENSIONAL METHOD" uses the time-dimensional predicate to pass
time-dimensional retrieval criteria to the DBMS. The predicate in this case requests that the
DBMS only return rows which are currently active from the EMPL_PAY_RATE table. The
DBMS will use the information previously defined for the time-dimensional table to fill in the
missing pieces of the request. The result sets from both requests will be identical. The
table name included in the time-dimensional predicate shown in table 2 is optional since the
EMPL_PAY_RATE table is the only time-dimensional table referenced in the request.

TABLE 2

CURRENT METHOD:

SELECT NAME,
       PAY_RATE
FROM   EMPLOYEE EMPL,
       EMPL_PAY_RATE PAY
WHERE  EMPL.EMPL_ID = PAY.EMPL_ID
AND    PAY.PAY_EFF_DATE =
        (SELECT MAX(PAY2.PAY_EFF_DATE)
         FROM   EMPL_PAY_RATE PAY2
         WHERE  PAY2.EMPL_ID = EMPL.EMPL_ID
         AND    PAY2.PAY_EFF_DATE <= CURRENT DATE)

TIME-DIMENSIONAL METHOD:

SELECT NAME,
       PAY_RATE
FROM   EMPLOYEE EMPL,
       EMPL_PAY_RATE PAY
WHERE  EMPL.EMPL_ID = PAY.EMPL_ID
AND    EMPL_PAY_RATE IS ACTIVE
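The "CURRENT METHOD" request can be tried against a small in-memory database. The sketch below is only illustrative: Figure 1 is not reproduced here, so the column types, the sample rows, and the helper names `demo_connection` and `current_pay_rates` are all assumptions, and the target date is passed as a parameter instead of CURRENT DATE so that the result is repeatable. Dates are stored as ISO-8601 text, which compares correctly as strings.

```python
import sqlite3


def demo_connection():
    """Build a hypothetical two-table example (sample data is invented)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EMPLOYEE (EMPL_ID INTEGER PRIMARY KEY, NAME TEXT);
        -- Versioned table: the effective date is the last primary key column.
        CREATE TABLE EMPL_PAY_RATE (
            EMPL_ID      INTEGER,
            PAY_EFF_DATE TEXT,
            PAY_RATE     REAL,
            PRIMARY KEY (EMPL_ID, PAY_EFF_DATE));
        INSERT INTO EMPLOYEE VALUES (1, 'Adams'), (2, 'Baker');
        INSERT INTO EMPL_PAY_RATE VALUES
            (1, '1994-01-01', 10.0),
            (1, '1995-01-01', 12.0),
            (1, '1997-01-01', 15.0),
            (2, '1996-06-01', 20.0);
    """)
    return conn


def current_pay_rates(conn, target_date):
    """Return (NAME, PAY_RATE) for the version in effect on target_date."""
    sql = """
        SELECT EMPL.NAME, PAY.PAY_RATE
        FROM   EMPLOYEE EMPL, EMPL_PAY_RATE PAY
        WHERE  EMPL.EMPL_ID = PAY.EMPL_ID
        AND    PAY.PAY_EFF_DATE =
                (SELECT MAX(PAY2.PAY_EFF_DATE)
                 FROM   EMPL_PAY_RATE PAY2
                 WHERE  PAY2.EMPL_ID = EMPL.EMPL_ID
                 AND    PAY2.PAY_EFF_DATE <= ?)
        ORDER BY EMPL.NAME
    """
    return conn.execute(sql, (target_date,)).fetchall()
```

With this sample data, a target date of 1996-01-01 returns only Adams at 12.0: the 1997 rate is still pending, and Baker has no version in effect yet. This is the same result that a time-dimensional "IS ACTIVE ON date" predicate is meant to express.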




TIME-DIMENSIONAL EXAMPLE 2 (retrieving historic information using a date range)



Figure 2 shows two additional tables of the example database introduced in Figure 1.
The EMPL_OCCUP table has been defined as a time-dimensional table with the
OCCUP_EFF_DATE identified as its effective start date. The logical primary key of the
EMPL_OCCUP table is the EMPL_ID column.

Table 3 shows two requests which retrieve employee names and identify the occupations
worked by each employee anytime during the year 1994. The request titled "CURRENT
METHOD" contains the complex subquery logic currently required to perform the task. The
version titled "TIME-DIMENSIONAL METHOD" uses the time-dimensional predicate to pass
time-dimensional criteria to the DBMS. The time-dimensional predicate shown specifies that
the DBMS should only return rows from time-dimensional tables which were active between
January 1st, 1994 and January 1st, 1995. The EMPL_OCCUP table is the only
time-dimensional table referenced in the request, so the DBMS will assume that the predicate
applies to that table only. The EMPL_OCCUP table definition will be accessed by the DBMS
to determine other information required to complete the request.



TABLE 3

CURRENT METHOD:

SELECT NAME,
       OCCUP_ID
FROM   EMPLOYEE EMPL,
       EMPL_OCCUP OCCUP
WHERE  EMPL.EMPL_ID = OCCUP.EMPL_ID
AND    OCCUP.OCCUP_EFF_DATE < '01/01/1995'
AND   (OCCUP.OCCUP_EFF_DATE >=
        (SELECT MAX(OCCUP2.OCCUP_EFF_DATE)
         FROM   EMPL_OCCUP OCCUP2
         WHERE  OCCUP2.EMPL_ID = EMPL.EMPL_ID
         AND    OCCUP2.OCCUP_EFF_DATE <= '01/01/1994')
OR     NOT EXISTS
        (SELECT OCCUP2.OCCUP_EFF_DATE
         FROM   EMPL_OCCUP OCCUP2
         WHERE  OCCUP2.EMPL_ID = EMPL.EMPL_ID
         AND    OCCUP2.OCCUP_EFF_DATE <= '01/01/1994'))

TIME-DIMENSIONAL METHOD:

SELECT NAME,
       OCCUP_ID
FROM   EMPLOYEE EMPL,
       EMPL_OCCUP OCCUP
WHERE  EMPL.EMPL_ID = OCCUP.EMPL_ID
AND    ACTIVE BETWEEN '01/01/1994' AND '01/01/1995'
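One way to read "active between" is as an overlap test between each row's effective window and the target range, both half-open. The sketch below follows that reading and is not the author's expansion of the predicate: it derives each row's effective end date from the next chronological row, as the defaulting rule describes, and uses '9999-12-31' as an assumed stand-in for the highest possible date. Figure 2 is not reproduced here, so the sample rows, the text-valued OCCUP_ID, and the helper names `demo_connection` and `rows_active_between` are hypothetical.

```python
import sqlite3

ACTIVE_BETWEEN_SQL = """
    SELECT EMPL.NAME, OCCUP.OCCUP_ID
    FROM   EMPLOYEE EMPL, EMPL_OCCUP OCCUP
    WHERE  EMPL.EMPL_ID = OCCUP.EMPL_ID
    -- the window must start before the range ends ...
    AND    OCCUP.OCCUP_EFF_DATE < :range_end
    -- ... and must end (at the next version's start) after the range starts
    AND    COALESCE(
             (SELECT MIN(NXT.OCCUP_EFF_DATE)
              FROM   EMPL_OCCUP NXT
              WHERE  NXT.EMPL_ID = OCCUP.EMPL_ID
              AND    NXT.OCCUP_EFF_DATE > OCCUP.OCCUP_EFF_DATE),
             '9999-12-31') > :range_start
    ORDER BY EMPL.NAME, OCCUP.OCCUP_EFF_DATE
"""


def demo_connection():
    """Build a hypothetical example database (sample data is invented)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EMPLOYEE (EMPL_ID INTEGER PRIMARY KEY, NAME TEXT);
        CREATE TABLE EMPL_OCCUP (
            EMPL_ID        INTEGER,
            OCCUP_EFF_DATE TEXT,
            OCCUP_ID       TEXT,
            PRIMARY KEY (EMPL_ID, OCCUP_EFF_DATE));
        INSERT INTO EMPLOYEE VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
        INSERT INTO EMPL_OCCUP VALUES
            (1, '1992-03-01', 'CLERK'),
            (1, '1994-07-01', 'ANALYST'),
            (1, '1995-02-01', 'MANAGER'),
            (2, '1995-03-01', 'DRIVER'),
            (3, '1990-01-01', 'WELDER');
    """)
    return conn


def rows_active_between(conn, range_start, range_end):
    """Rows whose effective window overlaps [range_start, range_end)."""
    params = {"range_start": range_start, "range_end": range_end}
    return conn.execute(ACTIVE_BETWEEN_SQL, params).fetchall()
```

For the range 1994-01-01 to 1995-01-01, the sample data yields Adams as CLERK and ANALYST and Clark as WELDER. The occupations that begin in 1995 are excluded because their windows start on or after the end of the range.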



TIME-DIMENSIONAL REFERENTIAL INTEGRITY 

Exceptions must be made in order to enforce referential constraints for time-dimensional
tables. In any case where the parent table in a referential constraint is time-dimensional, the
columns defined in the constraint must correspond to the logical primary key of the parent
table. In other words, the effective start date column of time-dimensional tables is not
included in referential constraints. This represents a departure from the current method
which enforces referential constraints based on the entire physical primary key of the parent
table. When validating a time-dimensional referential constraint, any row in the parent table
which has a logical primary key equal to the foreign key of the dependent row will satisfy
the constraint. It is possible that several rows of the parent table would satisfy the
constraint.

In many cases, it will be necessary to ensure that a time-dimensional parent row is
active for a specific range of dates defined by columns of the dependent table. In this
situation, the referential constraint definition would include the phrase "EFFECTIVE START
DATE IS dependent-table-column". An effective end date may also be defined for a
referential constraint by including the phrase "EFFECTIVE END DATE IS
dependent-table-column" in its definition. An effective window for each dependent row is
determined in the same manner that effective windows are defined for time-dimensional
tables. In order for a referential constraint which includes effective dates to be satisfied,
one or more parent table rows whose logical primary keys match the foreign key of the
dependent table must be effective for the entire range of dates identified in the effective
window.
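The window-based check can be sketched as a coverage test. The function name `parent_covers` and its arguments are hypothetical, and the sketch assumes one particular reading of "one or more parent table rows": that the union of the matching parent rows' effective windows must cover the dependent row's window with no gap. All windows are half-open, and the dependent window is assumed to be non-empty.

```python
from datetime import date


def parent_covers(parent_windows, dep_start, dep_end):
    """Return True if the parent windows jointly cover [dep_start, dep_end).

    `parent_windows` holds (start, end) pairs for parent rows whose logical
    primary key already matches the dependent row's foreign key.
    """
    covered_to = dep_start
    for start, end in sorted(parent_windows):
        if start > covered_to:
            break                 # gap: no parent row is effective here
        if end > covered_to:
            covered_to = end      # extend the covered stretch
        if covered_to >= dep_end:
            return True
    return covered_to >= dep_end
```

Two adjacent parent versions, 1993-01-01 to 1995-01-01 and 1995-01-01 to 1996-01-01, together satisfy a dependent window of 1994-06-01 to 1995-06-01. If the first version ended on 1994-12-01 instead, the December gap would cause the constraint to fail.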



CONCLUSION

The method of time-dimensional tables described in this article solves most of the
problems that exist concerning the maintenance and retrieval of versioned information using
relational and object-oriented database management systems. Individual tables would be
capable of maintaining historical, active and pending versions of information. Referential
constraints would be enforced in an appropriate manner for versioned information. The
process of retrieving versioned information from the database would become relatively simple.
Other features could also be included to improve functionality. For instance, the DBMS
could be modified to automatically prevent the effective windows of the rows of a versioned
table from overlapping. The data types used for version control would not have to be
limited to date data types. The effective start column and the effective end column of a
versioned table could be of any data type which has an implied order. Data types which
include a time, number data types, and some character data types could be used for version
control.

If the method of time-dimensional tables described here were to be implemented in a DBMS,
the process of maintaining and retrieving versioned information might be simplified to the
point that developers would no longer be afraid to include versioned information in their
databases.



